Renewable energy sources play an important role in the global energy mix, aiming to reduce the environmental impact of energy production.
Among renewable energy options, wind energy is one of the most developed. The U.S. Department of Energy has provided a guide to achieving operational efficiency using predictive maintenance.
Predictive maintenance uses sensor data and analysis to measure and predict degradation and component capability. The idea is that if failure patterns can be predicted and the component replaced before failure, operational and maintenance costs will be lower.
Sensors in energy generation machines collect data on environmental factors (temperature, humidity, wind speed, etc.) and features related to wind turbine components (gearbox, tower, blades, brake, etc.).
ReneWind is working on improving processes in wind energy production using machine learning. They have collected data on wind turbine generator failures using sensors. A ciphered version of this data was shared due to confidentiality.
The goal is to build and tune classification models to find the best one that identifies failures early, so generators can be repaired before breaking down, reducing maintenance cost.
The cost of repair is lower than replacement, and inspection cost is lower than repair.
The target is binary: 1 → Failure, 0 → No Failure. The data provided is a transformed version of the original data, which was collected using sensors.
Both datasets consist of 40 predictor variables and 1 target variable.
## 📚 Import Libraries
# Data manipulation
import pandas as pd
import numpy as np
# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import cm
from matplotlib.colors import Normalize
# Preprocessing and model evaluation
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Deep learning
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.losses import BinaryCrossentropy
# To be used for missing value imputation
from sklearn.impute import SimpleImputer
from tensorflow.keras.metrics import Recall
from sklearn.inspection import permutation_importance
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.metrics import (
confusion_matrix,
roc_curve,
precision_recall_curve,
auc,
precision_score,
recall_score,
f1_score,
roc_auc_score,
classification_report
)
import warnings
warnings.filterwarnings("ignore")
# Load data
train_df = pd.read_csv("Train.csv")
test_df = pd.read_csv("Test.csv")
print("Train Shape:", train_df.shape)
print("Test Shape:", test_df.shape)
Train Shape: (20000, 41)
Test Shape: (5000, 41)
print("\nTrain Head:")
train_df.head()
Train Head:
| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V32 | V33 | V34 | V35 | V36 | V37 | V38 | V39 | V40 | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -4.464606 | -4.679129 | 3.101546 | 0.506130 | -0.221083 | -2.032511 | -2.910870 | 0.050714 | -1.522351 | 3.761892 | ... | 3.059700 | -1.690440 | 2.846296 | 2.235198 | 6.667486 | 0.443809 | -2.369169 | 2.950578 | -3.480324 | 0 |
| 1 | 3.365912 | 3.653381 | 0.909671 | -1.367528 | 0.332016 | 2.358938 | 0.732600 | -4.332135 | 0.565695 | -0.101080 | ... | -1.795474 | 3.032780 | -2.467514 | 1.894599 | -2.297780 | -1.731048 | 5.908837 | -0.386345 | 0.616242 | 0 |
| 2 | -3.831843 | -5.824444 | 0.634031 | -2.418815 | -1.773827 | 1.016824 | -2.098941 | -3.173204 | -2.081860 | 5.392621 | ... | -0.257101 | 0.803550 | 4.086219 | 2.292138 | 5.360850 | 0.351993 | 2.940021 | 3.839160 | -4.309402 | 0 |
| 3 | 1.618098 | 1.888342 | 7.046143 | -1.147285 | 0.083080 | -1.529780 | 0.207309 | -2.493629 | 0.344926 | 2.118578 | ... | -3.584425 | -2.577474 | 1.363769 | 0.622714 | 5.550100 | -1.526796 | 0.138853 | 3.101430 | -1.277378 | 0 |
| 4 | -0.111440 | 3.872488 | -3.758361 | -2.982897 | 3.792714 | 0.544960 | 0.205433 | 4.848994 | -1.854920 | -6.220023 | ... | 8.265896 | 6.629213 | -10.068689 | 1.222987 | -3.229763 | 1.686909 | -2.163896 | -3.644622 | 6.510338 | 0 |
5 rows × 41 columns
print("\nTest Head:")
test_df.head()
Test Head:
| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V32 | V33 | V34 | V35 | V36 | V37 | V38 | V39 | V40 | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.613489 | -3.819640 | 2.202302 | 1.300420 | -1.184929 | -4.495964 | -1.835817 | 4.722989 | 1.206140 | -0.341909 | ... | 2.291204 | -5.411388 | 0.870073 | 0.574479 | 4.157191 | 1.428093 | -10.511342 | 0.454664 | -1.448363 | 0 |
| 1 | 0.389608 | -0.512341 | 0.527053 | -2.576776 | -1.016766 | 2.235112 | -0.441301 | -4.405744 | -0.332869 | 1.966794 | ... | -2.474936 | 2.493582 | 0.315165 | 2.059288 | 0.683859 | -0.485452 | 5.128350 | 1.720744 | -1.488235 | 0 |
| 2 | -0.874861 | -0.640632 | 4.084202 | -1.590454 | 0.525855 | -1.957592 | -0.695367 | 1.347309 | -1.732348 | 0.466500 | ... | -1.318888 | -2.997464 | 0.459664 | 0.619774 | 5.631504 | 1.323512 | -1.752154 | 1.808302 | 1.675748 | 0 |
| 3 | 0.238384 | 1.458607 | 4.014528 | 2.534478 | 1.196987 | -3.117330 | -0.924035 | 0.269493 | 1.322436 | 0.702345 | ... | 3.517918 | -3.074085 | -0.284220 | 0.954576 | 3.029331 | -1.367198 | -3.412140 | 0.906000 | -2.450889 | 0 |
| 4 | 5.828225 | 2.768260 | -1.234530 | 2.809264 | -1.641648 | -1.406698 | 0.568643 | 0.965043 | 1.918379 | -2.774855 | ... | 1.773841 | -1.501573 | -2.226702 | 4.776830 | -6.559698 | -0.805551 | -0.276007 | -3.858207 | -0.537694 | 0 |
5 rows × 41 columns
# Looking at the statistical summary of all variables
train_df.describe(include='all')
| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | ... | V32 | V33 | V34 | V35 | V36 | V37 | V38 | V39 | V40 | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 19982.000000 | 19982.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | ... | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 | 20000.000000 |
| mean | -0.271996 | 0.440430 | 2.484699 | -0.083152 | -0.053752 | -0.995443 | -0.879325 | -0.548195 | -0.016808 | -0.012998 | ... | 0.303799 | 0.049825 | -0.462702 | 2.229620 | 1.514809 | 0.011316 | -0.344025 | 0.890653 | -0.875630 | 0.055500 |
| std | 3.441625 | 3.150784 | 3.388963 | 3.431595 | 2.104801 | 2.040970 | 1.761626 | 3.295756 | 2.160568 | 2.193201 | ... | 5.500400 | 3.575285 | 3.183841 | 2.937102 | 3.800860 | 1.788165 | 3.948147 | 1.753054 | 3.012155 | 0.228959 |
| min | -11.876451 | -12.319951 | -10.708139 | -15.082052 | -8.603361 | -10.227147 | -7.949681 | -15.657561 | -8.596313 | -9.853957 | ... | -19.876502 | -16.898353 | -17.985094 | -15.349803 | -14.833178 | -5.478350 | -17.375002 | -6.438880 | -11.023935 | 0.000000 |
| 25% | -2.737146 | -1.640674 | 0.206860 | -2.347660 | -1.535607 | -2.347238 | -2.030926 | -2.642665 | -1.494973 | -1.411212 | ... | -3.420469 | -2.242857 | -2.136984 | 0.336191 | -0.943809 | -1.255819 | -2.987638 | -0.272250 | -2.940193 | 0.000000 |
| 50% | -0.747917 | 0.471536 | 2.255786 | -0.135241 | -0.101952 | -1.000515 | -0.917179 | -0.389085 | -0.067597 | 0.100973 | ... | 0.052073 | -0.066249 | -0.255008 | 2.098633 | 1.566526 | -0.128435 | -0.316849 | 0.919261 | -0.920806 | 0.000000 |
| 75% | 1.840112 | 2.543967 | 4.566165 | 2.130615 | 1.340480 | 0.380330 | 0.223695 | 1.722965 | 1.409203 | 1.477045 | ... | 3.761722 | 2.255134 | 1.436935 | 4.064358 | 3.983939 | 1.175533 | 2.279399 | 2.057540 | 1.119897 | 0.000000 |
| max | 15.493002 | 13.089269 | 17.090919 | 13.236381 | 8.133797 | 6.975847 | 8.006091 | 11.679495 | 8.137580 | 8.108472 | ... | 23.633187 | 16.692486 | 14.358213 | 15.291065 | 19.329576 | 7.467006 | 15.289923 | 7.759877 | 10.654265 | 1.000000 |
8 rows × 41 columns
# Checking the data types
train_df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 20000 entries, 0 to 19999 Data columns (total 41 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 V1 19982 non-null float64 1 V2 19982 non-null float64 2 V3 20000 non-null float64 3 V4 20000 non-null float64 4 V5 20000 non-null float64 5 V6 20000 non-null float64 6 V7 20000 non-null float64 7 V8 20000 non-null float64 8 V9 20000 non-null float64 9 V10 20000 non-null float64 10 V11 20000 non-null float64 11 V12 20000 non-null float64 12 V13 20000 non-null float64 13 V14 20000 non-null float64 14 V15 20000 non-null float64 15 V16 20000 non-null float64 16 V17 20000 non-null float64 17 V18 20000 non-null float64 18 V19 20000 non-null float64 19 V20 20000 non-null float64 20 V21 20000 non-null float64 21 V22 20000 non-null float64 22 V23 20000 non-null float64 23 V24 20000 non-null float64 24 V25 20000 non-null float64 25 V26 20000 non-null float64 26 V27 20000 non-null float64 27 V28 20000 non-null float64 28 V29 20000 non-null float64 29 V30 20000 non-null float64 30 V31 20000 non-null float64 31 V32 20000 non-null float64 32 V33 20000 non-null float64 33 V34 20000 non-null float64 34 V35 20000 non-null float64 35 V36 20000 non-null float64 36 V37 20000 non-null float64 37 V38 20000 non-null float64 38 V39 20000 non-null float64 39 V40 20000 non-null float64 40 Target 20000 non-null int64 dtypes: float64(40), int64(1) memory usage: 6.3 MB
The dataset contains 40 predictor variables (V1 to V40), all of type float64, suggesting they represent continuous sensor readings or engineered numeric values. The target column, Target, is of type int64 with values 0 (No Failure) and 1 (Failure), confirming this is a binary classification problem. All columns are fully populated except V1 and V2, each with 19982 non-null entries (18 missing values apiece), which will need to be handled during preprocessing. ✅ This step ensures the dataset schema is well understood before proceeding with analysis, imputation, or modeling.
# Checking for missing values
train_df.isnull().sum()
V1 18 V2 18 V3 0 V4 0 V5 0 V6 0 V7 0 V8 0 V9 0 V10 0 V11 0 V12 0 V13 0 V14 0 V15 0 V16 0 V17 0 V18 0 V19 0 V20 0 V21 0 V22 0 V23 0 V24 0 V25 0 V26 0 V27 0 V28 0 V29 0 V30 0 V31 0 V32 0 V33 0 V34 0 V35 0 V36 0 V37 0 V38 0 V39 0 V40 0 Target 0 dtype: int64
Only V1 and V2 contain null values, with exactly 18 missing entries each. All remaining features (V3 to V40) and the target variable are fully populated. A simple median imputation for V1 and V2 is sufficient and avoids unnecessary modifications to other columns. ✅ This confirms the dataset is largely clean and requires only minor imputation before model training.
# Checking for duplicated values
train_df.duplicated().sum()
0
No duplicated rows were found in the training data (train_df), confirming that it is free of redundancy.
# Checking for missing values
test_df.isnull().sum()
V1 5 V2 6 V3 0 V4 0 V5 0 V6 0 V7 0 V8 0 V9 0 V10 0 V11 0 V12 0 V13 0 V14 0 V15 0 V16 0 V17 0 V18 0 V19 0 V20 0 V21 0 V22 0 V23 0 V24 0 V25 0 V26 0 V27 0 V28 0 V29 0 V30 0 V31 0 V32 0 V33 0 V34 0 V35 0 V36 0 V37 0 V38 0 V39 0 V40 0 Target 0 dtype: int64
The missing-value check on the test data (test_df) revealed 5 missing entries in V1 and 6 in V2. 📌 These can be addressed with the same imputation strategy (e.g., median) fitted on the training data, ensuring consistency between training and test sets.
✅ These checks ensure both data integrity and model reliability, preventing unintended behavior during inference.
# Checking for duplicated values
test_df.duplicated().sum()
0
No duplicated rows were found in the test data (test_df), confirming that it is also free of redundancy.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
"""
Boxplot and histogram combined
data: dataframe
feature: dataframe column
figsize: size of figure (default (12,7))
kde: whether to show the density curve (default False)
bins: number of bins for histogram (default None)
"""
f2, (ax_box2, ax_hist2) = plt.subplots(
nrows=2, # Number of rows of the subplot grid= 2
sharex=True, # x-axis will be shared among all subplots
gridspec_kw={"height_ratios": (0.25, 0.75)},
figsize=figsize,
) # creating the 2 subplots
sns.boxplot(
    data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
)  # boxplot; the star marks the mean value of the column
# Histogram (palette is omitted: seaborn ignores it without a hue mapping)
if bins:
    sns.histplot(data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins)
else:
    sns.histplot(data=data, x=feature, kde=kde, ax=ax_hist2)
ax_hist2.axvline(
data[feature].mean(), color="green", linestyle="--"
) # Add mean to the histogram
ax_hist2.axvline(
data[feature].median(), color="black", linestyle="-"
) # Add median to the histogram
# Plot each column (V1 to V40 and Target) with the combined histogram/boxplot
for col in train_df.columns:
    histogram_boxplot(train_df, col, figsize=(12, 7), kde=True, bins=None)
train_df.hist(figsize=(20, 15), bins=30)
plt.tight_layout()
plt.show()
The histograms confirm a strong class imbalance, with far more samples in class 0 than in class 1.
# Heatmap of the correlation coefficients between numerical variables in the dataset
cols_list = train_df.select_dtypes(include=np.number).columns.tolist()
plt.figure(figsize=(18, 12))
sns.heatmap(
train_df[cols_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".1f", cmap="coolwarm"
)
plt.show()
# Selected a subset of features for the pair plot for better performance and clarity
selected_features = ['V2', 'V7', 'V18', 'V21','V25', 'V28', 'V40', 'Target']
# Create the pair plot using the selected features
sns.pairplot(train_df[selected_features], hue='Target', diag_kind='kde')
plt.show()
Class Imbalance:
The plots show a significant imbalance between the two classes. Class 0 dominates, consistent with the earlier class distribution analysis.
Separable Features:
V18 and V21 show visible clustering differences between classes and might be strong predictors. V28 and V40 also exhibit some separation and could contribute to class distinction.
Overlapping Distributions:
V2, V7, and V25 overlap heavily between classes, suggesting they may be less informative individually.
Diagonal KDE Plots (Univariate Distributions):
Class 1 distributions are flatter due to fewer data points, again highlighting the class imbalance; some features (e.g., V28) display slight skewness.
Correlation Patterns:
The pairs (V28, V40) and (V18, V25) appear to be correlated. V18 and V21 correlate not only with each other but also with the target variable, so V18, V21, and V28 are worth watching in the feature-importance analysis; they appear to provide the best separation.
# Dividing train data into X and y
X = train_df.drop(["Target"], axis=1)
y = train_df["Target"]
# Splitting train dataset into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=1, stratify=y)
# Checking the number of rows and columns in training and validation sets
print(X_train.shape, X_val.shape)
(15000, 40) (5000, 40)
# Defining X (independent) and y (dependent) variables for test set
X_test = test_df.drop(['Target'], axis=1)
y_test = test_df['Target']
# Checking the number of rows and columns in test set
print(X_test.shape)
(5000, 40)
The following steps were taken to prepare the dataset for model training:
Feature & Target Separation:
From train_df, features (X) were extracted by dropping the Target column, and the Target column was assigned to y as the label for supervised learning.
Train–Validation Split:
The training data was split 75/25 into training and validation sets using the stratify=y argument, which preserves the distribution of the target variable across the split. Consistent label proportions between training and validation are especially important for imbalanced classification problems like this one.
Test Set Preparation:
Features (X_test) and target (y_test) were defined from test_df. All feature matrices (X_train, X_val, X_test) have consistent dimensions of (n, 40), ensuring compatibility with the model input.
This preprocessing sets a solid foundation for fair and consistent model evaluation during training and final testing.
imputer = SimpleImputer(strategy='median')
# Fit and transform the train data
X_train = pd.DataFrame(imputer.fit_transform(X_train), columns=X_train.columns)
# Transform the validation data
X_val = pd.DataFrame(imputer.transform(X_val), columns=X_train.columns)
# Transform the test data (the imputer is fitted only on training data to avoid leakage)
X_test = pd.DataFrame(imputer.transform(X_test), columns=X_train.columns)
# Checking that there are no missing values in train or test sets
print(X_train.isna().sum().sum())
print("-" * 30)
print(X_val.isna().sum().sum())
print("-" * 30)
print(X_test.isna().sum().sum())
0 ------------------------------ 0 ------------------------------ 0
A SimpleImputer with the median strategy was applied to handle missing values: it was fitted on the training data and then applied to the validation and test sets. ✅ This step ensures that the model receives clean and consistent input features without any NaNs or incomplete rows.
The objective of the Renewind project is to develop a classification model that can accurately identify potential wind turbine failures using sensor data. Given the significant cost associated with undetected failures, the model evaluation strategy must be aligned with the business goal of reducing unplanned downtime and minimizing replacement costs.
Rationale:
In this classification problem, a false negative (i.e., failing to predict an actual failure) leads to severe consequences, such as unplanned turbine breakdowns, costly component replacement, and extended downtime.
As a result, minimizing false negatives is of utmost importance. This translates to maximizing recall, which measures the proportion of actual failures that are correctly identified by the model.
A high recall ensures that most real failures are detected and maintenance can be scheduled proactively, thereby avoiding major breakdowns.
While recall is the primary metric, other evaluation metrics help provide a broader view of the model’s performance and operational trade-offs.
| Metric | Purpose |
|---|---|
| Precision | Indicates how many of the predicted failures were actually correct. Helps manage unnecessary inspection costs. |
| F1 Score | Harmonic mean of precision and recall. Useful when both false positives and false negatives are relevant. |
| ROC-AUC Score | Measures the model's ability to distinguish between failure and non-failure across different thresholds. |
| Business Cost | Reflects real-world impact by assigning weights to TP, FP, and FN outcomes based on operational cost. |
The following table outlines how different prediction outcomes translate to real-world cost implications:
| Outcome | Description | Cost Impact |
|---|---|---|
| True Positive | Correctly predicted failure | Repair cost (e.g., ₹100) |
| False Negative | Missed prediction of actual failure | Replacement cost (e.g., ₹1000) |
| False Positive | Incorrectly predicted failure | Inspection cost (e.g., ₹100) |
| True Negative | Correctly predicted no failure | No cost |
Given this matrix, false negatives are the most costly. Therefore, recall must be prioritized during model evaluation and selection.
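The cost matrix above can be turned into a simple scoring function for comparing models. A minimal sketch, using the illustrative unit costs from the table (₹100 repair, ₹100 inspection, ₹1000 replacement); `maintenance_cost` is a hypothetical helper, not part of the original notebook:

```python
# Sketch of a business-cost function based on the cost table above.
# The unit costs (repair=100, inspection=100, replacement=1000) are the
# illustrative figures from the table; substitute actual operational costs.

def maintenance_cost(tp, fp, fn, repair=100, inspection=100, replacement=1000):
    """Total cost implied by a confusion matrix: TPs trigger repairs,
    FPs trigger inspections, FNs lead to full replacements."""
    return tp * repair + fp * inspection + fn * replacement

# With the validation counts reported later for the SGD model
# (TP=231, FP=4, FN=47): 231*100 + 4*100 + 47*1000 = 70500
print(maintenance_cost(231, 4, 47))  # 70500
```

Because a false negative costs ten times a true positive here, even a modest recall improvement can outweigh a noticeable precision drop under this cost function.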
| Metric | Reason for Exclusion |
|---|---|
| Accuracy | Can be misleading due to class imbalance. A model predicting all turbines as "No Failure" might show high accuracy while missing all actual failures. |
| RMSE / MAE | These are regression metrics and are not applicable to binary classification tasks. |
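The accuracy pitfall in the table can be demonstrated with a toy example mirroring this dataset's roughly 5.5% failure rate (the Target mean in the summary statistics): a degenerate "always No Failure" model scores high accuracy while catching zero failures.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Mimic the class ratio of the training data (~5.5% failures)
y_true = np.array([1] * 55 + [0] * 945)
y_pred = np.zeros_like(y_true)  # always predict "No Failure"

print(accuracy_score(y_true, y_pred))  # 0.945 — looks impressive
print(recall_score(y_true, y_pred))    # 0.0 — every failure is missed
```

This is exactly why recall, not accuracy, drives model selection in this project.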
| Evaluation Aspect | Metric / Approach | Priority Level |
|---|---|---|
| Primary Metric | Recall | High |
| Complementary Metric | F1 Score | Medium |
| Business Insight | Precision, Business Cost Function | Medium |
| Discrimination Capability | ROC-AUC Score | Medium |
| Discarded Metrics | Accuracy, RMSE, MAE | Not Applicable |
This evaluation strategy ensures the model is selected and tuned based on business-critical requirements. By prioritizing recall, the model focuses on reducing the most costly classification errors, while additional metrics provide operational insights and help fine-tune trade-offs between performance and cost.
# Build base classification model using SGD
model_sgd = Sequential([
Dense(64, activation='relu', input_dim=X_train.shape[1]),  # 40 input features
Dense(1, activation='sigmoid') # sigmoid for binary classification
])
# Compile with binary crossentropy loss
model_sgd.compile(optimizer=SGD(learning_rate=0.01), loss=BinaryCrossentropy())
# Train the model
model_sgd.fit(X_train, y_train, epochs=50, batch_size=32,
validation_data=(X_val, y_val), verbose=0)
# Predict on validation set
val_preds_proba = model_sgd.predict(X_val)
val_preds = (val_preds_proba > 0.5).astype(int)
# Evaluation Metrics
precision = precision_score(y_val, val_preds)
recall = recall_score(y_val, val_preds)
f1 = f1_score(y_val, val_preds)
roc_auc = roc_auc_score(y_val, val_preds_proba)
# Confusion Matrix and Business Cost Calculation
tn, fp, fn, tp = confusion_matrix(y_val, val_preds).ravel()
# Output Results
print(f"\nSGD Model Classification Metrics:")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 Score: {f1:.4f}")
print(f"ROC AUC: {roc_auc:.4f}")
print(f"\nConfusion Matrix:")
print(f"TP: {tp}, FP: {fp}, FN: {fn}, TN: {tn}")
SGD Model Classification Metrics:
Precision: 0.9830
Recall: 0.8309
F1 Score: 0.9006
ROC AUC: 0.9454

Confusion Matrix:
TP: 231, FP: 4, FN: 47, TN: 4718
| Metric | Value |
|---|---|
| Precision | 0.9830 |
| Recall | 0.8309 |
| F1 Score | 0.9006 |
| ROC AUC | 0.9454 |
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive (1) | TP = 231 | FN = 47 |
| Actual Negative (0) | FP = 4 | TN = 4718 |
# Confusion matrix
cm = confusion_matrix(y_val, val_preds)
plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=["Predicted 0", "Predicted 1"],
yticklabels=["Actual 0", "Actual 1"])
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
The heatmap shows the distribution of classification outcomes:
While the model demonstrates excellent precision and high overall accuracy, the recall rate is moderate, resulting in 47 missed failure cases.
In the context of wind turbine maintenance, these false negatives carry significant business risk due to unplanned breakdowns and costly replacements. Therefore, further recall improvement is necessary, for example by lowering the classification threshold, applying class weights, or oversampling the minority class.
The goal is to reduce false negatives without introducing too many false positives — thereby maintaining cost-efficiency while improving failure detection reliability.
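One concrete lever is the decision threshold: sweeping it over the predicted probabilities makes the recall/precision trade-off explicit. A minimal sketch on synthetic scores (in the notebook, `val_preds_proba.ravel()` and `y_val` would take the place of `probs` and `labels`):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Synthetic stand-in: ~5.5% positives, with positive scores shifted upward
# but overlapping the negatives, as in a realistic classifier.
rng = np.random.default_rng(1)
labels = (rng.random(2000) < 0.055).astype(int)
probs = 0.35 * labels + rng.random(2000) * 0.6

# Lowering the threshold below the default 0.5 trades precision for recall
for t in [0.5, 0.4, 0.3, 0.2]:
    preds_t = (probs > t).astype(int)
    print(f"threshold={t:.1f}  "
          f"recall={recall_score(labels, preds_t):.3f}  "
          f"precision={precision_score(labels, preds_t):.3f}")
```

Under the business cost matrix, the threshold would be chosen where the total cost of false negatives and false positives is minimized, not at 0.5 by default.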
fpr, tpr, _ = roc_curve(y_val, val_preds_proba)
roc_auc = auc(fpr, tpr)
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (area = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc="lower right")
plt.grid(True)
plt.show()
A high ROC AUC value of 0.94 confirms that the model can effectively rank predictions and identify failures across various thresholds. This is particularly valuable in scenarios like predictive maintenance, where different thresholds might be applied depending on risk tolerance or resource constraints.
However, while ROC AUC is a strong overall indicator, it does not directly penalize false negatives, which are costly in the Renewind use case. Therefore, ROC AUC should be used in conjunction with recall and business cost-based metrics to guide final model selection and threshold tuning.
precision_vals, recall_vals, _ = precision_recall_curve(y_val, val_preds_proba)
plt.figure(figsize=(8, 6))
plt.plot(recall_vals, precision_vals, color='purple', lw=2)
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision-Recall Curve')
plt.grid(True)
plt.show()
# Capture training history (note: this continues training the already-fitted
# model for another 50 epochs, so the curves below reflect epochs 51-100)
history = model_sgd.fit(X_train, y_train, epochs=50, batch_size=32,
validation_data=(X_val, y_val), verbose=0)
# Plot training vs validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training vs Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Binary Crossentropy Loss')
plt.legend()
plt.grid(True)
plt.show()
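The loss curves above can also guide when to stop training. EarlyStopping and ReduceLROnPlateau are already imported in this notebook but not yet wired in; the sketch below shows the intended usage on a tiny synthetic problem (the patience values and the demo data are illustrative, not tuned; in the notebook the callbacks would be passed to `model_sgd.fit` instead):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# EarlyStopping halts training once val_loss stops improving and restores
# the best weights; ReduceLROnPlateau lowers the learning rate first.
callbacks = [
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]

# Tiny synthetic binary problem just to show the fit() wiring
X_demo = np.random.rand(200, 40)
y_demo = (X_demo[:, 0] > 0.5).astype(int)
model = Sequential([Input(shape=(40,)),
                    Dense(16, activation="relu"),
                    Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")
history_demo = model.fit(X_demo, y_demo, validation_split=0.25,
                         epochs=50, batch_size=32, verbose=0,
                         callbacks=callbacks)
print(len(history_demo.history["loss"]))  # epochs actually run (at most 50)
```

With restore_best_weights=True, the model keeps the weights from the epoch with the lowest validation loss, which guards against the mild overfitting visible in the curves.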
def compute_permutation_importance(model, X: pd.DataFrame, y, metric=recall_score, n_repeats=10):
# Get baseline recall
baseline_preds = (model.predict(X, verbose=0).flatten() > 0.5).astype(int)
baseline_score = metric(y, baseline_preds)
importances = []
feature_names = X.columns
for col in feature_names:
score_drops = []
for _ in range(n_repeats):
X_permuted = X.copy()
X_permuted[col] = np.random.permutation(X_permuted[col].values)
permuted_preds = (model.predict(X_permuted, verbose=0).flatten() > 0.5).astype(int)
permuted_score = metric(y, permuted_preds)
score_drops.append(baseline_score - permuted_score) # drop in recall
importances.append(np.mean(score_drops))
return pd.Series(importances, index=feature_names).sort_values(ascending=False)
# Run on validation data
feature_importances = compute_permutation_importance(
model_sgd,
X_val,  # already a DataFrame after imputation
y_val,
metric=recall_score
)
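The resulting Series is sorted by mean recall drop, so plotting its head highlights the most influential sensors. A sketch of one way to visualize it, shown on a small synthetic Series (in the notebook, `feature_importances` from the cell above would be passed instead; the feature names below echo the pair-plot observations and are illustrative):

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_top_importances(importances: pd.Series, top_n: int = 10):
    """Horizontal bar chart of the top_n permutation importances
    (assumes the Series is already sorted in descending order)."""
    top = importances.head(top_n)
    plt.figure(figsize=(8, 5))
    top.iloc[::-1].plot(kind="barh", color="teal")  # largest bar on top
    plt.xlabel("Mean drop in recall when permuted")
    plt.title(f"Top {top_n} features by permutation importance")
    plt.tight_layout()
    plt.show()

# Illustrative values only; real values come from compute_permutation_importance
demo = pd.Series([0.08, 0.05, 0.02, 0.0, -0.01],
                 index=["V18", "V21", "V28", "V2", "V7"])
plot_top_importances(demo, top_n=5)
```

Features with importance near zero (or negative) barely affect recall when shuffled and are candidates for removal in a leaner model.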
0s 216us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 518us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 206us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 206us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 206us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 
━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 206us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 215us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 214us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 203us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 194us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 218us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 188us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 184us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 185us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 188us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 183us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 183us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 189us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 182us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 547us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 218us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 
0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 184us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 182us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 183us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 188us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 191us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 292us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 
━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 594us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 246us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 215us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 198us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 226us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 214us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 207us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 
0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 215us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 208us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 210us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 209us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 211us/step 157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 213us/step
# Plot
plt.figure(figsize=(12, 6))
feature_importances.sort_values().plot(kind='barh')
plt.title("Permutation Feature Importances (based on Recall Drop)")
plt.xlabel("Drop in Recall when Feature is Permuted")
plt.tight_layout()
plt.show()
The top 10 influential features include:
V18, V36, V12, V24, V19, V29, V1, V15, V32, V27.
Features like V9, V17, V37, and V22 show almost zero drop in recall, suggesting minimal contribution to the model’s predictions. These may be redundant or noisy.
📌 Tip: To further validate importance, consider cross-checking these rankings with SHAP values or training a model on just top-N features to evaluate performance trade-offs.
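The top-N retraining idea from the tip can be sketched with plain pandas: given the permutation-importance Series computed in this notebook, keep only the highest-ranked features before refitting. The Series below is a small synthetic stand-in for `feature_importances`, and `top_n` is an illustrative cutoff, not a tuned value.

```python
import pandas as pd

# Synthetic stand-in for the permutation-importance Series computed above
feature_importances = pd.Series(
    {"V18": 0.042, "V36": 0.031, "V12": 0.027, "V9": 0.001, "V17": 0.000}
)

top_n = 3  # illustrative cutoff
top_features = feature_importances.nlargest(top_n).index.tolist()
print(top_features)  # → ['V18', 'V36', 'V12']

# A reduced design matrix would then be X_train[top_features] before retraining
```

Comparing validation recall of the reduced model against the full 40-feature model quantifies how much the low-importance features actually contribute.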
The base SGD model's classification metrics and confusion matrix on the validation set are summarized in the comparison table below (row `SGD_1L`).

Comment:

While this base SGD model offers a strong starting benchmark, its shallow architecture and lack of complexity might restrict its ability to detect nuanced failure patterns. The high number of false negatives (43 on the validation set) is a concern in a predictive maintenance setting, where undetected failures lead to high replacement or repair costs.

To address this, subsequent models explore deeper architectures, the Adam optimizer, dropout regularization, and balanced class weights.

These enhancements are aimed at improving failure detection, especially reducing false negatives while maintaining a low false positive rate.
from tensorflow.keras.metrics import Recall

def build_model(layers=1, optimizer='adam', dropout=0.0):
    model = Sequential()
    # Input layer
    model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))
    # Hidden layers
    for _ in range(layers - 1):
        model.add(Dense(64, activation='relu'))
        if dropout > 0:
            model.add(Dropout(dropout))
    # Output layer for binary classification
    model.add(Dense(1, activation='sigmoid'))
    # Compile with Recall as the only metric
    model.compile(optimizer=optimizer,
                  loss='binary_crossentropy',
                  metrics=[Recall()])
    return model
from tensorflow.keras.optimizers import SGD, Adam

configs = [
    {'name': 'SGD_1L', 'layers': 1, 'optimizer': SGD(), 'dropout': 0.0},
    {'name': 'SGD_3L', 'layers': 3, 'optimizer': SGD(), 'dropout': 0.0},
    {'name': 'Adam_3L', 'layers': 3, 'optimizer': Adam(), 'dropout': 0.0},
    {'name': 'Adam_3L_Dropout', 'layers': 3, 'optimizer': Adam(), 'dropout': 0.3},
    {'name': 'Adam_5L_Dropout', 'layers': 5, 'optimizer': Adam(), 'dropout': 0.3},
    {'name': 'SGD_5L_Dropout', 'layers': 5, 'optimizer': SGD(learning_rate=0.01), 'dropout': 0.2},
    {'name': 'Adam_2L', 'layers': 2, 'optimizer': Adam(), 'dropout': 0.1},
    {'name': 'Adam_4L_Dropout', 'layers': 4, 'optimizer': Adam(), 'dropout': 0.2},
    {'name': 'SGD_4L_Dropout', 'layers': 4, 'optimizer': SGD(learning_rate=0.01), 'dropout': 0.1},
    {'name': 'Adam_6L_Dropout', 'layers': 6, 'optimizer': Adam(), 'dropout': 0.4},
    {'name': 'Adam_6L_Dropout_CW', 'layers': 6, 'optimizer': Adam(), 'dropout': 0.4, 'class_weight': 'balanced'},
    {'name': 'SGD_3L_Dropout_CW', 'layers': 3, 'optimizer': SGD(learning_rate=0.01), 'dropout': 0.2, 'class_weight': 'balanced'}
]
from sklearn.utils.class_weight import compute_class_weight
from sklearn.metrics import (
    recall_score, precision_score, f1_score, roc_auc_score, confusion_matrix
)
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

results = []
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
lr_scheduler = ReduceLROnPlateau(monitor='val_loss', patience=5, factor=0.5)
for config in configs:
    print(f"\n🔧 Training: {config['name']}")
    model = build_model(
        layers=config['layers'],
        optimizer=config['optimizer'],
        dropout=config['dropout']
    )

    # Compute class weights if required
    if config.get('class_weight') == 'balanced':
        class_weights_array = compute_class_weight(
            class_weight='balanced', classes=np.unique(y_train), y=y_train
        )
        class_weight = {i: class_weights_array[i] for i in range(len(class_weights_array))}
    else:
        class_weight = None

    # Train model
    model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=100, batch_size=32, verbose=0,
        callbacks=[early_stop, lr_scheduler],
        class_weight=class_weight
    )

    # Predict for train and val
    y_train_probs = model.predict(X_train).ravel()
    y_val_probs = model.predict(X_val).ravel()
    y_train_preds = (y_train_probs >= 0.5).astype(int)
    y_val_preds = (y_val_probs >= 0.5).astype(int)

    # Train metrics
    train_recall = recall_score(y_train, y_train_preds)
    train_precision = precision_score(y_train, y_train_preds)
    train_f1 = f1_score(y_train, y_train_preds)
    train_roc_auc = roc_auc_score(y_train, y_train_probs)
    train_tn, train_fp, train_fn, train_tp = confusion_matrix(y_train, y_train_preds).ravel()

    # Validation metrics
    val_recall = recall_score(y_val, y_val_preds)
    val_precision = precision_score(y_val, y_val_preds)
    val_f1 = f1_score(y_val, y_val_preds)
    val_roc_auc = roc_auc_score(y_val, y_val_probs)
    val_tn, val_fp, val_fn, val_tp = confusion_matrix(y_val, y_val_preds).ravel()

    results.append({
        'Model': config['name'],
        'Train_Recall': train_recall,
        'Val_Recall': val_recall,
        'Train_Precision': train_precision,
        'Val_Precision': val_precision,
        'Train_F1': train_f1,
        'Val_F1': val_f1,
        'Train_ROC_AUC': train_roc_auc,
        'Val_ROC_AUC': val_roc_auc,
        'Train_TP': train_tp,
        'Train_FN': train_fn,
        'Train_FP': train_fp,
        'Val_TP': val_tp,
        'Val_FN': val_fn,
        'Val_FP': val_fp
    })
results_df = pd.DataFrame(results).sort_values(by='Val_Recall', ascending=False)
print("\n📊 Model Comparison Based on Classification Metrics:")
pd.set_option("display.max_columns", None) # Show all columns
print(results_df)
results_df
| | Model | Train_Recall | Val_Recall | Train_Precision | Val_Precision | Train_F1 | Val_F1 | Train_ROC_AUC | Val_ROC_AUC | Train_TP | Train_FN | Train_FP | Val_TP | Val_FN | Val_FP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11 | SGD_3L_Dropout_CW | 0.929087 | 0.888489 | 0.912633 | 0.888489 | 0.920786 | 0.888489 | 0.992865 | 0.954362 | 773 | 59 | 74 | 247 | 31 | 31 |
| 7 | Adam_4L_Dropout | 0.908654 | 0.870504 | 0.983095 | 0.979757 | 0.944410 | 0.921905 | 0.977817 | 0.953883 | 756 | 76 | 13 | 242 | 36 | 5 |
| 10 | Adam_6L_Dropout_CW | 0.914663 | 0.870504 | 0.875719 | 0.858156 | 0.894768 | 0.864286 | 0.970485 | 0.941064 | 761 | 71 | 108 | 242 | 36 | 40 |
| 4 | Adam_5L_Dropout | 0.906250 | 0.866906 | 0.981771 | 0.964000 | 0.942500 | 0.912879 | 0.979689 | 0.958609 | 754 | 78 | 14 | 241 | 37 | 9 |
| 5 | SGD_5L_Dropout | 0.902644 | 0.866906 | 0.985564 | 0.960159 | 0.942284 | 0.911153 | 0.967481 | 0.947972 | 751 | 81 | 11 | 241 | 37 | 10 |
| 6 | Adam_2L | 0.891827 | 0.863309 | 0.991979 | 0.975610 | 0.939241 | 0.916031 | 0.977134 | 0.954164 | 742 | 90 | 6 | 240 | 38 | 6 |
| 8 | SGD_4L_Dropout | 0.907452 | 0.863309 | 0.993421 | 0.967742 | 0.948492 | 0.912548 | 0.981493 | 0.948102 | 755 | 77 | 5 | 240 | 38 | 8 |
| 3 | Adam_3L_Dropout | 0.896635 | 0.859712 | 0.993342 | 0.979508 | 0.942514 | 0.915709 | 0.976077 | 0.952625 | 746 | 86 | 5 | 239 | 39 | 5 |
| 2 | Adam_3L | 0.871394 | 0.856115 | 0.975774 | 0.971429 | 0.920635 | 0.910134 | 0.964009 | 0.944155 | 725 | 107 | 18 | 238 | 40 | 7 |
| 9 | Adam_6L_Dropout | 0.894231 | 0.856115 | 0.981530 | 0.975410 | 0.935849 | 0.911877 | 0.972528 | 0.951598 | 744 | 88 | 14 | 238 | 40 | 6 |
| 1 | SGD_3L | 0.900240 | 0.852518 | 0.989432 | 0.955645 | 0.942731 | 0.901141 | 0.980790 | 0.954046 | 749 | 83 | 8 | 237 | 41 | 11 |
| 0 | SGD_1L | 0.873798 | 0.845324 | 0.987772 | 0.987395 | 0.927296 | 0.910853 | 0.973365 | 0.943675 | 727 | 105 | 9 | 235 | 43 | 3 |
| Criteria | Best Model(s) |
|---|---|
| Highest Validation Recall | SGD_3L_Dropout_CW (0.8885) |
| Best Generalization (Low Overfitting) | Adam_4L_Dropout |
| Most Cost-Efficient (Low FP + High Recall) | Adam_2L (FP = 6, FN = 38) |
| Best F1 Score | Adam_4L_Dropout (0.9219) |
| Best Precision | Adam_2L (0.9756) |
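The criteria above can be operationalised as a single weighted score over the comparison table, so the choice is reproducible rather than eyeballed. The weights below are illustrative assumptions, not values from the analysis; the DataFrame is a three-row excerpt of `results_df`.

```python
import pandas as pd

# Excerpt of the comparison table (values taken from the results above)
df = pd.DataFrame({
    "Model": ["SGD_3L_Dropout_CW", "Adam_4L_Dropout", "Adam_2L"],
    "Val_Recall": [0.888489, 0.870504, 0.863309],
    "Val_Precision": [0.888489, 0.979757, 0.975610],
})

# Hypothetical weights: recall matters more than precision for failure detection
w_recall, w_precision = 0.7, 0.3
df["Score"] = w_recall * df["Val_Recall"] + w_precision * df["Val_Precision"]

best = df.sort_values("Score", ascending=False).iloc[0]["Model"]
print(best)  # → Adam_4L_Dropout
```

Under these particular weights the composite score agrees with the recommendation below; shifting the weights toward recall would favour SGD_3L_Dropout_CW instead.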
Deploy Adam_4L_Dropout
Offers the best trade-off of high recall, strong precision, low false positives, and generalization ability. It is highly suited for failure prediction where balanced performance and reliability matter.
Consider Adam_2L
When the cost of false alarms (inspections) is a bigger concern than catching every failure. Its very high precision ensures minimal wasteful checks.
Use SGD_3L_Dropout_CW cautiously
If maximizing recall is the absolute priority (e.g., catching every failure is critical), this model is suitable despite a higher false positive rate.
Avoid models like SGD_1L, SGD_3L, and Adam_3L, which trail on recall and miss more failures, and Adam_6L_Dropout_CW, whose high false positive count (40) would trigger many unnecessary inspections — all risky or wasteful for predictive maintenance in wind turbines.
# Set style
sns.set(style="whitegrid")
plt.rcParams["figure.figsize"] = (12, 6)

# Plotting function
def plot_metric_comparison(df, metric, ylabel):
    train_col = f"Train_{metric}"
    val_col = f"Val_{metric}"
    plt.figure(figsize=(14, 6))
    x = df["Model"]
    plt.plot(x, df[train_col], marker='o', label=f'Train {metric}', color='blue')
    plt.plot(x, df[val_col], marker='s', label=f'Validation {metric}', color='orange')
    plt.xticks(rotation=45, ha='right')
    plt.ylabel(ylabel)
    plt.title(f"{metric} Comparison (Train vs Validation)")
    plt.legend()
    plt.tight_layout()
    plt.show()

plot_metric_comparison(results_df, "Recall", "Recall")
plot_metric_comparison(results_df, "Precision", "Precision")
plot_metric_comparison(results_df, "F1", "F1 Score")
plot_metric_comparison(results_df, "ROC_AUC", "ROC AUC")
In this experiment, we evaluated 12 deep learning model configurations for binary classification by varying network depth (1 to 6 hidden layers), optimizer (SGD vs. Adam), dropout rate, and class weighting.
Each model was assessed on the classification metrics most relevant to the ReneWind failure prediction problem, with emphasis on Recall, F1 Score, false positives (FP), and false negatives (FN) — considering the business impact of missed failure detection and unnecessary inspections.
Models evaluated:

- SGD_3L_Dropout_CW
- Adam_4L_Dropout
- Adam_5L_Dropout
- Adam_6L_Dropout_CW
- Adam_2L
- Adam_3L_Dropout
- Adam_3L
- SGD_5L_Dropout
- SGD_4L_Dropout
- SGD_3L
- SGD_1L
- Adam_6L_Dropout
| Model | Val Recall | Val F1 | Val FP | Val FN | Comments |
|---|---|---|---|---|---|
| Adam_4L_Dropout | 0.870 | 0.922 | 5 | 36 | ✅ Best trade-off of recall, F1, and low FP |
| SGD_3L_Dropout_CW | 0.888 | 0.888 | 31 | 31 | High recall but very high FP |
| Adam_5L_Dropout | 0.867 | 0.913 | 9 | 37 | Balanced, slight overfitting |
| Adam_2L | 0.863 | 0.916 | 6 | 38 | Lightweight, generalizes well |
| Adam_3L_Dropout | 0.860 | 0.916 | 5 | 39 | Low FP, slightly worse recall |
| Others | < 0.856 | < 0.910 | — | — | Underperforming in critical failure detection |
✅ Recommended model: Adam_4L_Dropout
It delivers the best combination of recall, F1 score, and low false positives, making it ideal for failure detection in a cost-sensitive industrial setting.
Adam_2L and Adam_3L_Dropout are great lightweight backups that perform nearly as well and generalize strongly.
❌ Avoid SGD_3L_Dropout_CW in production due to high false positives despite high recall — leads to inspection overload.
# Rebuild the final selected model (Adam, 4 layers, dropout = 0.2)
final_model = build_model(layers=4, optimizer=Adam(), dropout=0.20)

# Train on training set, validate on validation set (not the test set!)
final_model_history = final_model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100,
    batch_size=32,
    callbacks=[early_stop, lr_scheduler],
    verbose=0
)
from sklearn.metrics import classification_report

# Predict probabilities on test data
test_pred_proba = final_model.predict(X_test).ravel()

# Convert probabilities to class labels (threshold = 0.5)
test_preds = (test_pred_proba > 0.5).astype(int)

# Evaluation metrics
precision = precision_score(y_test, test_preds)
recall = recall_score(y_test, test_preds)
f1 = f1_score(y_test, test_preds)
roc_auc = roc_auc_score(y_test, test_pred_proba)
cm = confusion_matrix(y_test, test_preds)

# Print evaluation results
print("Test Set Evaluation Metrics:")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 Score: {f1:.4f}")
print(f"ROC AUC: {roc_auc:.4f}")
print("\nConfusion Matrix:")
print(cm)

# Full classification report
print("\nClassification Report:")
print(classification_report(y_test, test_preds))
Test Set Evaluation Metrics:
Precision: 0.9792
Recall: 0.8333
F1 Score: 0.9004
ROC AUC: 0.9404

Confusion Matrix:
[[4713    5]
 [  47  235]]

Classification Report:
              precision    recall  f1-score   support

           0       0.99      1.00      0.99      4718
           1       0.98      0.83      0.90       282

    accuracy                           0.99      5000
   macro avg       0.98      0.92      0.95      5000
weighted avg       0.99      0.99      0.99      5000
# Make predictions
test_pred_proba = final_model.predict(X_test).flatten()
test_preds = (test_pred_proba > 0.5).astype(int)

# Confusion matrix
cm = confusion_matrix(y_test, test_preds)
tn, fp, fn, tp = cm.ravel()

# Precision, Recall, F1, ROC AUC
precision = precision_score(y_test, test_preds)
recall = recall_score(y_test, test_preds)
f1 = f1_score(y_test, test_preds)
roc_auc = roc_auc_score(y_test, test_pred_proba)

# 1. Confusion Matrix Heatmap
plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=["Predicted 0", "Predicted 1"],
            yticklabels=["Actual 0", "Actual 1"])
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
| | Predicted 0 | Predicted 1 |
|---|---|---|
| Actual 0 | 4713 (True Negative) | 5 (False Positive) |
| Actual 1 | 47 (False Negative) | 235 (True Positive) |
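The business framing (inspection cost < repair cost < replacement cost) can be turned into an expected-cost check on this confusion matrix: a false positive triggers an inspection, a true positive a repair caught in time, and a false negative a full replacement. The unit costs below are placeholder assumptions, not ReneWind figures.

```python
# Test-set confusion-matrix counts from the table above
tp, fn, fp = 235, 47, 5

# Hypothetical unit costs obeying replacement > repair > inspection
C_INSPECT, C_REPAIR, C_REPLACE = 1, 5, 20

total_cost = fp * C_INSPECT + tp * C_REPAIR + fn * C_REPLACE
print(total_cost)  # → 2120

# Baseline: never predicting failure means every failure costs a replacement
baseline_cost = (tp + fn) * C_REPLACE
print(baseline_cost)  # → 5640
```

Under these assumed costs the model cuts maintenance spend to roughly 38% of the do-nothing baseline; the same arithmetic with real cost figures would give the actual savings.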
from sklearn.metrics import roc_curve

# 2. ROC Curve
fpr, tpr, _ = roc_curve(y_test, test_pred_proba)
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC Curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=1, linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC)')
plt.legend(loc="lower right")
plt.grid(True)
plt.show()
from sklearn.metrics import precision_recall_curve

# 3. Precision–Recall Curve
prec_vals, rec_vals, _ = precision_recall_curve(y_test, test_pred_proba)
plt.figure(figsize=(8, 6))
plt.plot(rec_vals, prec_vals, color='purple', lw=2)
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision–Recall Curve')
plt.grid(True)
plt.show()
# 4. Training vs Validation Loss (from the History object returned by fit)
plt.figure(figsize=(8, 5))
plt.plot(final_model_history.history['loss'], label='Training Loss')
plt.plot(final_model_history.history['val_loss'], label='Validation Loss')
plt.title('Training vs Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Binary Crossentropy Loss')
plt.legend()
plt.grid(True)
plt.show()
# 5. Threshold vs Metrics Curve
thresholds = np.linspace(0, 1, 100)
precisions, recalls, f1s = [], [], []

for thresh in thresholds:
    preds_thresh = (test_pred_proba > thresh).astype(int)
    precisions.append(precision_score(y_test, preds_thresh, zero_division=0))
    recalls.append(recall_score(y_test, preds_thresh))
    f1s.append(f1_score(y_test, preds_thresh))

plt.figure(figsize=(10, 6))
plt.plot(thresholds, precisions, label='Precision')
plt.plot(thresholds, recalls, label='Recall')
plt.plot(thresholds, f1s, label='F1 Score')
plt.xlabel('Threshold')
plt.ylabel('Score')
plt.title('Precision, Recall, and F1 Score vs Threshold')
plt.legend()
plt.grid(True)
plt.show()
At very low thresholds (< 0.1), recall is near its maximum because almost every turbine is flagged as a failure, but precision is poor, so most alerts would be false alarms.
As the threshold increases from 0.1 to ~0.4, precision rises quickly while recall declines only modestly, improving the overall balance.
The F1 score remains high and stable between thresholds of 0.4 and 0.8, suggesting the model performs reliably across this range.
Beyond a threshold of 0.9, recall drops sharply as only the most confident predictions remain positive, so many true failures would be missed.
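The threshold analysis above can also be reduced to a single number: scan a candidate grid and keep the threshold that maximizes F1. The helper below is a self-contained NumPy sketch on synthetic scores; in the notebook, `y_test` and `test_pred_proba` would replace `y` and `p`.

```python
import numpy as np

def best_f1_threshold(y_true, probs, thresholds):
    """Return (threshold, f1) for the F1-maximizing threshold on the grid."""
    best_t, best_f1 = 0.5, 0.0
    for t in thresholds:
        preds = (probs > t).astype(int)
        tp = np.sum((preds == 1) & (y_true == 1))
        fp = np.sum((preds == 1) & (y_true == 0))
        fn = np.sum((preds == 0) & (y_true == 1))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Synthetic example with well-separated scores
y = np.array([0, 0, 0, 1, 1, 1])
p = np.array([0.1, 0.2, 0.3, 0.6, 0.8, 0.9])
t, f1 = best_f1_threshold(y, p, np.arange(0.05, 1.0, 0.1))
# t ≈ 0.35, f1 = 1.0 on this toy data
```

In a cost-sensitive deployment one would instead maximize recall subject to a precision floor, or minimize the expected maintenance cost directly, but the scanning pattern is the same.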
def compute_permutation_importance(model, X: pd.DataFrame, y, metric=recall_score, n_repeats=10):
    # Get baseline recall
    baseline_preds = (model.predict(X).flatten() > 0.5).astype(int)
    baseline_score = metric(y, baseline_preds)

    importances = []
    feature_names = X.columns
    for col in feature_names:
        score_drops = []
        for _ in range(n_repeats):
            X_permuted = X.copy()
            X_permuted[col] = np.random.permutation(X_permuted[col].values)
            permuted_preds = (model.predict(X_permuted).flatten() > 0.5).astype(int)
            permuted_score = metric(y, permuted_preds)
            score_drops.append(baseline_score - permuted_score)  # drop in recall
        importances.append(np.mean(score_drops))

    return pd.Series(importances, index=feature_names).sort_values(ascending=False)
# Run on the test data (X_test must be a DataFrame so columns can be permuted)
feature_importances = compute_permutation_importance(
    final_model,
    X_test,
    y_test,
    metric=recall_score
)
# Plot
plt.figure(figsize=(12, 6))
feature_importances.sort_values().plot(kind='barh')
plt.title("Permutation Feature Importances (based on Recall Drop)")
plt.xlabel("Drop in Recall when Feature is Permuted")
plt.tight_layout()
plt.show()
Overall, the model’s predictive power for failure detection is concentrated in a small subset of features, which offers clear guidance for future sensor selection, model compression, and interpretability work.
Confusion matrix on test data shows:
Resulting benefits:
The Adam 4L Dropout (0.2) model shows strong reliability, balancing precision and recall, with:
This makes it a strong candidate for deployment in wind turbine failure prediction, capable of powering intelligent maintenance pipelines that combine predictive performance with interpretability.
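The confusion-matrix figures for the ReneWind test set are not reproduced in this excerpt. As an illustrative sketch (using made-up labels and predictions, not the actual test results), the quantities behind the precision/recall trade-off can be extracted like this:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Illustrative labels and predictions only (1 = Failure, 0 = No Failure);
# these are NOT the ReneWind test-set results.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 1])

# ravel() flattens the 2x2 matrix in sklearn's [[TN, FP], [FN, TP]] order
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision_score(y_true, y_pred):.2f}",
      f"recall={recall_score(y_true, y_pred):.2f}")
```

In this problem, false negatives (missed failures leading to replacements) are the costliest outcome, which is why recall is the tuning metric throughout; false positives only incur the cheaper inspection/repair cost.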